Corpus: bel-by_web_2015_100K

4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1 to 5) at sentence endings, counted over the first K sentences

K          # of words   # of bigrams   # of trigrams   # of 4-grams   # of 5-grams
100                90             92              94             95             95
1000              785            910             942            954            964
10000            6859           9155            9627           9748           9794
100000          42177          79816           92914          96935          98052
1000000         42178          79817           92915          96936          98053
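
The following is a minimal sketch of how counts like those in the table could be computed. It is not the tool that produced this page. It assumes one pre-tokenized sentence per string, whitespace-separated tokens, and that a sentence shorter than N tokens contributes no N-gram. The function name `ending_ngram_counts` and its parameters are hypothetical.

```python
from typing import Dict, Iterable, List


def ending_ngram_counts(
    sentences: Iterable[str], ks: Iterable[int], max_n: int = 5
) -> Dict[int, List[int]]:
    """Count distinct sentence-final word N-grams for the first K sentences.

    Returns a dict mapping each K to a list of counts for N = 1..max_n.
    Assumption: every K is a positive integer and tokens are split on
    whitespace. If K exceeds the number of sentences, the totals for the
    whole input are reported.
    """
    seen = [set() for _ in range(max_n)]  # seen[n-1] holds distinct N-grams
    targets = sorted(set(ks))
    results: Dict[int, List[int]] = {}
    idx = 0

    for count, sentence in enumerate(sentences, start=1):
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                # The last n tokens form the sentence-ending N-gram.
                seen[n - 1].add(tuple(tokens[-n:]))
        # Record a snapshot whenever a requested K has been reached.
        while idx < len(targets) and targets[idx] == count:
            results[targets[idx]] = [len(s) for s in seen]
            idx += 1

    # Any remaining K is larger than the input: report the final totals.
    for k in targets[idx:]:
        results[k] = [len(s) for s in seen]
    return results


if __name__ == "__main__":
    demo = ["a b c .", "x b c .", "a b c .", "d ."]
    for k, counts in ending_ngram_counts(demo, [2, 4, 10], max_n=3).items():
        print(k, counts)
```

Keeping one set per N and taking snapshots at each K makes a single pass over the corpus sufficient for all rows of the table.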


Zipf's diagram for sentence endings


Gnuplot diagram
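
The page does not state exactly what the diagram plots. Assuming it is the usual Zipf view (frequency of each sentence-ending N-gram against its frequency rank, typically on log-log axes), the underlying rank-frequency data could be sketched as follows. The function name `zipf_rank_frequency` is hypothetical, and tokens are again assumed to be whitespace-separated.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def zipf_rank_frequency(
    sentences: Iterable[str], n: int = 1
) -> List[Tuple[int, int]]:
    """Return (rank, frequency) pairs for sentence-final word N-grams.

    Rank 1 is the most frequent ending. Sentences with fewer than n
    tokens are skipped (an assumption, not documented by the source).
    """
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if len(tokens) >= n:
            counts[tuple(tokens[-n:])] += 1
    # Sort frequencies in descending order and attach 1-based ranks.
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


if __name__ == "__main__":
    demo = ["a .", "b .", "a .", "c !"]
    # Print tab-separated rank and frequency, one pair per line.
    for rank, freq in zipf_rank_frequency(demo, n=1):
        print(f"{rank}\t{freq}")
```

The tab-separated output is a two-column data file of the kind Gnuplot can plot directly; setting logarithmic axes there gives the familiar near-linear Zipf curve when the distribution follows a power law.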

6543 msec needed at 2018-04-04 16:42